Crowdsourced Accessibility: Elicitation of Wikipedia Articles

Authors

  • Scott Novotney
  • Chris Callison-Burch
Abstract

Mechanical Turk is useful for generating complex speech resources like conversational speech transcription. In this work, we explore the next step of eliciting narrations of Wikipedia articles to improve accessibility for low-literacy users. This task proves a useful test-bed for implementing qualitative vetting of workers based on difficult-to-define metrics like narrative quality. Working with the Mechanical Turk API, we collected sample narrations, had other Turkers rate these samples, and then granted access to full narration HITs depending on aggregate quality. While narrating full articles proved too onerous a task to be viable, using other Turkers to perform vetting was very successful. Elicitation is possible on Mechanical Turk, but it should conform to suggested best practices of simple tasks that can be completed in a streamlined workflow.
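
The vetting step described in the abstract (collect sample narrations, have other workers rate them, grant access based on aggregate quality) can be illustrated with a minimal sketch. This is not the authors' implementation: the 1-5 rating scale, the minimum number of raters, the mean-score threshold, and the function name `vet_workers` are all assumptions made for illustration, and the sketch deliberately makes no Mechanical Turk API calls.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical parameters: the source does not state the rating scale,
# the acceptance threshold, or how many raters judged each sample.
MIN_RATINGS = 3
MIN_MEAN_SCORE = 3.5  # on an assumed 1-5 scale


def vet_workers(ratings, min_ratings=MIN_RATINGS, min_mean=MIN_MEAN_SCORE):
    """Return the set of narrator IDs whose sample narration passed peer rating.

    ratings: iterable of (narrator_id, rater_id, score) tuples.
    """
    scores = defaultdict(dict)
    for narrator_id, rater_id, score in ratings:
        if narrator_id == rater_id:
            continue  # ignore self-ratings
        # Keep one score per rater; a later score from the same rater overwrites.
        scores[narrator_id][rater_id] = score

    approved = set()
    for narrator_id, by_rater in scores.items():
        # Require enough independent ratings and a high enough aggregate score.
        if len(by_rater) >= min_ratings and mean(by_rater.values()) >= min_mean:
            approved.add(narrator_id)
    return approved


if __name__ == "__main__":
    sample = [
        ("w1", "r1", 5), ("w1", "r2", 4), ("w1", "r3", 4),
        ("w2", "r1", 2), ("w2", "r2", 3), ("w2", "r3", 3),
    ]
    print(vet_workers(sample))
```

In a real deployment, the approved set would then be used to grant access to the full narration HITs, for example through a Mechanical Turk qualification; that step is left out here because the source does not describe how access was granted.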

Similar resources

Shared Task: Crowdsourced Accessibility Elicitation of Wikipedia Articles

Mechanical Turk is useful for generating complex speech resources like conversational speech transcription. In this work, we explore the next step of eliciting narrations of Wikipedia articles to improve accessibility for low-literacy users. This task proves a useful test-bed for implementing qualitative vetting of workers based on difficult-to-define metrics like narrative quality. Working with th...

Crowdsourcing elicitation data for semantic typologies

In semantic typology, it is desirable to have quick and easy access to crosslinguistic elicitations describing stimuli from a semantic domain. We explore the use of crowdsourcing for obtaining such data, and compare it with fieldwork data obtained through in-person elicitations. Despite potential concerns about the quality of crowdsourced data, we find no difference in the amount of between-lan...

Making your database available through Wikipedia: the pros and cons

Wikipedia, the online encyclopedia, is the most famous wiki in use today. It contains over 3.7 million pages of content; with many pages written on scientific subject matters that include peer-reviewed citations, yet are written in an accessible manner and generally reflect the consensus opinion of the community. In this, the 19th Annual Database Issue of Nucleic Acids Research, there are 11 ar...

Advertising Keyword Suggestion Using Relevance-Based Language Models from Wikipedia Rich Articles

When emerging technologies such as Search Engine Marketing (SEM) face tasks that require human level intelligence, it is inevitable to use the knowledge repositories to endow the machine with the breadth of knowledge available to humans. Keyword suggestion for search engine advertising is an important problem for sponsored search and SEM that requires a goldmine repository of knowledge. A recen...

Handling Information Overload: Automatic Generation of Wikipedia Articles

The exponential growth of information on the web over the years has led to the problem of information overload, i.e., the amount of information present on the web is beyond the processing capacity of any system. Thus, a need arises for a single resource that properly covers a topic and keeps up-to-date information about it. The popular website Wikipedia does the same in a structured manner th...

Publication date: 2010